Conversation

@jimburtoft
Contributor

This kernel replaces the MaxPool2D function. I am not sure whether it is faster than a traced PyTorch version or not.

However, it does show an interesting use of masking to avoid extra memory writes: instead of padding the input with -inf rows and columns on every edge, I just adjust my indices and mask the values for the columns I didn't insert.
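For illustration, a minimal sketch of that masking idea follows, assuming the public nki.language API (nl.arange, nl.load, nl.max with its mask argument, nl.store); the kernel name and structure here are hypothetical and are not the PR's actual implementation:

from neuronxcc import nki
import neuronxcc.nki.language as nl

@nki.jit
def maxpool2d_masked_sketch(in_tensor, kernel_size, stride, padding):
    # Hypothetical sketch, not the PR's kernel: mask out-of-bounds pool taps
    # instead of materializing -inf padding rows/columns in memory.
    sz_cin, sz_hin, sz_win = in_tensor.shape
    sz_hout = (sz_hin + 2 * padding - kernel_size) // stride + 1
    sz_wout = (sz_win + 2 * padding - kernel_size) // stride + 1
    out_tensor = nl.ndarray((sz_cin, sz_hout, sz_wout), dtype=in_tensor.dtype,
                            buffer=nl.shared_hbm)

    i0 = nl.arange(sz_cin)[:, None, None, None, None]       # channel
    i1 = nl.arange(sz_hout)[None, :, None, None, None]      # output row
    i2 = nl.arange(sz_wout)[None, None, :, None, None]      # output column
    i3 = nl.arange(kernel_size)[None, None, None, :, None]  # pool height
    i4 = nl.arange(kernel_size)[None, None, None, None, :]  # pool width

    in_tile = nl.load(in_tensor)  # whole-tensor load, as in the average-pool tutorial

    ih = stride * i1 + i3 - padding  # input row of each pool tap (may be out of range)
    iw = stride * i2 + i4 - padding  # input column of each pool tap

    # The mask drops taps that fall outside the input, so no padded values are
    # ever written; only in-bounds elements feed the max reduction.
    out_tile = nl.max(in_tile[i0, ih, iw], axis=[3, 4],
                      mask=((ih >= 0) & (ih < sz_hin) & (iw >= 0) & (iw < sz_win)))

    nl.store(out_tensor, value=out_tile)
    return out_tensor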

All tests are included in the code.

Testing:

Please see the detailed unit test requirements in CONTRIBUTING.md.

  • [x] The change is covered by a numeric check using nki.baremetal
  • [x] The change is covered by a performance benchmark test using nki.benchmark
  • [ ] The change is covered by an end-to-end integration test

Pull Request Checklist

  • [x] I have filled in all the required fields in the template
  • [x] I have tested locally that all the tests pass
  • [x] By submitting this pull request, I confirm that my contribution is made under the terms of the MIT-0 license.

sz_cin, sz_hin, sz_win = in_tensor.shape
sz_hout = (sz_hin + 2*padding - kernel_size) // stride + 1
sz_wout = (sz_win + 2*padding - kernel_size) // stride + 1

Contributor

Let's add assertions on the expected shape and parameter values here.
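For illustration, the assertions could look something like this (a sketch only, not part of the PR; nl.tile_size.pmax is the documented partition-size limit in nki.language, and the half-kernel padding bound mirrors PyTorch's MaxPool2d constraint):

# Sketch only: possible sanity checks before computing the pooling geometry.
assert kernel_size >= 1 and stride >= 1, "kernel_size and stride must be positive"
assert 0 <= padding <= kernel_size // 2, "padding should be at most half the kernel size"
assert sz_cin <= nl.tile_size.pmax, "channel dim must fit in the partition dimension"
assert sz_hout >= 1 and sz_wout >= 1, "pooling parameters yield an empty output"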

sz_p = sz_cin

# Generate pool index patterns with stride
i0 = nl.arange(sz_p)[:, None, None, None, None] # Channel dim
Contributor

Let's use mgrid for this.
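As a rough illustration of that suggestion (assuming nl.mgrid accepts the same N-dimensional slicing as numpy.mgrid, which the docs describe it as mirroring; this is not the PR's code):

# Hypothetical: one nl.mgrid call in place of the five broadcast nl.arange patterns.
i0, i1, i2, i3, i4 = nl.mgrid[0:sz_p, 0:sz_hout, 0:sz_wout, 0:kernel_size, 0:kernel_size]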

i4 = nl.arange(kernel_size)[None, None, None, None, :] # Pool width

# Load input data
in_tile: tensor[sz_p, sz_hin, sz_win] = nl.load(in_tensor)
Contributor

Per the docs here, your partition dimension must be the first dimension, and these should be 2D tiles. We're deprecating the block dimension on SBUF.

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/generated/nki.language.load.html#nki.language.load
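For reference, one way to read this feedback is to load one 2D [C, W] tile per input row, keeping the channel (partition) dimension first. The fragment below is only a sketch against the kernel's existing shape variables, using nl.affine_range and the index-array form of nl.load; it is an assumption about how the kernel could be restructured, not the author's implementation:

i_p = nl.arange(sz_cin)[:, None]   # partition dim: channels, kept as the first dimension
i_w = nl.arange(sz_win)[None, :]   # free dim: width
for h in nl.affine_range(sz_hin):
    row_tile = nl.load(in_tensor[i_p, h, i_w])  # each load is a 2D [C, W] tile
    # ...the pooling reduction would then combine the relevant row_tiles per output row...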

Contributor Author

@JonathanHenson The tiling was based on the average_pool2D example.

in_tile: tensor[sz_p, sz_hin, sz_win] = nl.load(in_tensor)

Will that be updated any time soon?
